Seaborn has many built-in capabilities for regression plots. However, we won't really discuss regression until the machine learning section of the course, so for now we will only cover the lmplot() function.
lmplot() lets you display linear models, and it also conveniently lets you split those plots into separate panels by feature and color the points by feature with the hue argument.
Let's explore how this works:
In [1]:
import seaborn as sns
%matplotlib inline
In [2]:
tips = sns.load_dataset('tips')
In [3]:
tips.head()
Out[3]:
In [5]:
sns.lmplot(x='total_bill',y='tip',data=tips)
Out[5]:
In [6]:
sns.lmplot(x='total_bill',y='tip',data=tips,hue='sex')
Out[6]:
In [13]:
sns.lmplot(x='total_bill',y='tip',data=tips,hue='sex',palette='coolwarm')
Out[13]:
lmplot forwards its scatter_kws argument to regplot, the axes-level function it uses to draw each panel, and regplot in turn passes that dictionary to plt.scatter. To change the marker size, set the s key in that dictionary. Somewhat confusingly, s is the marker area in points squared, not the marker width. In other words, you end up passing a dictionary of base matplotlib arguments, in this case s for the size of the scatter markers. You probably won't remember this off the top of your head, so reference the documentation when you need it.
In [16]:
# Marker styles: http://matplotlib.org/api/markers_api.html
sns.lmplot(x='total_bill',y='tip',data=tips,hue='sex',palette='coolwarm',
           markers=['o','v'],scatter_kws={'s':100})
Out[16]:
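To make the meaning of s concrete, here is a minimal sketch that uses matplotlib directly rather than lmplot. The toy coordinates and the variable names (small, large, area_ratio, width_ratio) are made up for illustration. It shows that quadrupling s only doubles the marker width, because s is an area.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# s is the marker area in points**2 (illustrative toy points, not the tips data)
small = ax.scatter([1, 2, 3], [1, 2, 3], s=25)
large = ax.scatter([1, 2, 3], [2, 3, 4], s=100)

# get_sizes() returns the stored areas for each scatter collection
area_ratio = float(large.get_sizes()[0] / small.get_sizes()[0])

# Width scales with the square root of the area
width_ratio = area_ratio ** 0.5

print(area_ratio, width_ratio)
```

So scatter_kws={'s':100} asks for markers with four times the area, and about twice the width, of markers drawn with s=25.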
In [28]:
sns.lmplot(x='total_bill',y='tip',data=tips,col='sex')
Out[28]:
In [30]:
sns.lmplot(x="total_bill", y="tip", row="sex", col="time",data=tips)
Out[30]:
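As a rough sketch of what row and col do, the example below builds a small synthetic DataFrame (an assumption made so the code runs without downloading the tips dataset; the column names simply mirror tips) and inspects the grid that lmplot returns. Each unique row value gets a row of panels and each unique col value gets a column.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the tips data (hypothetical values, same column names)
rng = np.random.default_rng(0)
n = 80
demo = pd.DataFrame({
    "total_bill": rng.uniform(5, 50, n),
    "sex": np.tile(["Male", "Female"], n // 2),     # alternates Male/Female
    "time": np.repeat(["Lunch", "Dinner"], n // 2), # first half Lunch, then Dinner
})
demo["tip"] = 0.15 * demo["total_bill"] + rng.normal(0, 1, n)

# lmplot returns a FacetGrid; its axes array has one entry per panel
g = sns.lmplot(x="total_bill", y="tip", row="sex", col="time", data=demo)
grid_shape = g.axes.shape

print(grid_shape)
```

With two values of sex and two values of time, the grid has two rows and two columns, and each panel gets its own scatter plot and fitted line.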
In [24]:
sns.lmplot(x='total_bill',y='tip',data=tips,col='day',hue='sex',palette='coolwarm')
Out[24]:
In [36]:
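# height is each panel's height in inches; aspect * height gives its width.
# Note: the size parameter was renamed to height in seaborn 0.9.
sns.lmplot(x='total_bill',y='tip',data=tips,col='day',hue='sex',palette='coolwarm',
           aspect=0.6,height=8)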
Out[36]: